MAC TR-121


ON LOWER BOUNDS FOR SELECTION PROBLEMS 


Foong Frances Yao 


This research was supported by the National Science Foundation
under research grant GJ-34671.


CAMBRIDGE 


02139 




TABLE OF CONTENTS

CHAPTER 1   PRELIMINARIES
    1.1  Introduction
    1.2  Tree Algorithm
    1.3  The Concept of a Strategy
    1.4  Crucial vs. Noncrucial Comparisons

CHAPTER 2   SELECTION OF THE THIRD BEST PLAYER
    2.1  Introduction
    2.2  Weight Function and Basic Strategy
    2.3  Nearly Exact Bounds for i=3
         2.3.1  Main Theorem
         2.3.2  Proof of Lemma 1

CHAPTER 3   GENERAL LOWER BOUNDS FOR SMALL i
    3.1  Introduction
    3.2  Main Theorem
    3.3  Proof of Main Theorem
    3.4  Proof of Lemma 2
         3.4.1  Auxiliary Propositions
         3.4.2  The Inductive Proof

CHAPTER 4   FINDING A MEDIOCRE PLAYER
    4.1  Introduction
    4.2  Properties of S(i,j,n)
    4.3  $S(1,j,n) = V_2(j+2)$
    4.4  Connections with Median Computation




ABSTRACT 


Let $V_i(n)$ be the minimum number of binary comparisons that
are required to determine the i-th largest of n elements drawn
from a totally ordered set. In this thesis we use adversary
strategies to prove lower bounds on $V_i(n)$. For i=3, our lower
bounds determine $V_3(n)$ precisely for infinitely many values of n,
and determine $V_3(n)$ to within 2 for all n. For a general fixed
i, our lower bound has the asymptotic form
$n + (i-1)\log n - O(\log(\log^* n))$, where $\log^* n$ is a very
slowly growing function. As a result, the asymptotic behavior of
$V_i(n)$ is determined to within $O(\log(\log^* n))$. A more general
problem is raised in which one wants to find an element which is
(i,j)-mediocre, i.e., smaller than at least i elements and greater
than at least j elements. For i=1, it is shown that the best
algorithm is to select the (i+1)-st largest of any subset of i+j+1
elements. It is an interesting question whether for general i this
procedure is also optimal for finding an (i,j)-mediocre element.
An affirmative answer to this


question would imply $V_{n/2}(n) < 3n$.


CHAPTER 1 


PRELIMINARIES 
1.1 Introduction 


In this thesis we are interested in the problem of determining
the i-th largest element of a totally ordered set of n objects.
We shall concentrate on selection methods in which only pairwise
comparisons are allowed.


The history of this problem dates back to 1883 when Lewis 
Carroll in an essay[2] pointed out that the usual playing procedures 
of a "knockout" tennis tournament would in most cases select the 
wrong second and third best players. Therefore in the essay he set 
out to devise a plan for finding the true second and third best 
players. There are of course many ways to achieve this goal, and a 
question that naturally arises is "In how few matches (assuming 
transitivity and antisymmetry) between n players can one decide 
the i-th best player?" Let $V_i(n)$ be the answer to this problem.
The other closely related problem is to select in order the first,
second, ..., and the i-th best players, and the minimum number of
matches required here we denote by $W_i(n)$.


Some obvious properties of V and W are:
(i) $V_i(n) \le W_i(n)$


(ii) $V_i(n) = V_{n-i+1}(n)$ (by symmetry); $W_n(n) = W_{n-1}(n)$


(iii) $W_i(n) \ge \lceil \log(i!) \rceil$, an information theoretic lower bound.
(All of our logarithms are taken to the base 2.)

(iv) $V_1(n) = W_1(n) = n-1$ (cf. also Sec. 1.4)


In 1932 Schreier[7] gave an algorithm for finding the first and
second best of n players that requires at most
$n - 2 + \lceil \log n \rceil$ matches. This construction was
generalized by Kislitsyn[3] in 1964 to show that

$$W_i(n) \le n - i + \sum_{k=0}^{i-2} \lceil \log(n-k) \rceil \quad \text{for all } i. \tag{1}$$

His algorithm uses the first i stages of "tree selection" sort (cf.
Knuth[4], Sec. 5.2.3). We first set up a knockout tournament of n
players, and determine the best player in n-1 matches. We then
"output" the champion and select the best of the remaining players.
After repeating this procedure i times, we will have found the first,
second, ..., and the i-th best players. Note that after the initial
tournament is set up, at each later stage only a portion of the
tournament needs to be reconstructed in order to find the new champion.
One can in fact show that the first i stages require at most the
number of matches shown on the right-hand side of (1).


If only the i-th best player is desired, Hadian and Sobel[5]
proved that

$$V_i(n) \le n - i + (i-1)\lceil \log(n-i+2) \rceil \quad \text{for all } i. \tag{2}$$

Their construction is as follows: One sets aside a set S of i-2
players and performs a knockout tournament on the remaining n-i+2
players. (This requires n-i+1 matches.) The champion of this
tournament ranks higher than n-i+1 players, and hence is too good to
be the i-th best player. We replace him by a player from S. One
rebuilds part of the tournament as needed to determine the new
champion (which requires at most $\lceil \log(n-i+2) \rceil$ matches).
Again, we replace the champion after finding him. After i-1 passes,
we have successfully eliminated i-1 players that are overqualified.
Now the champion of the remaining n-i+1 players in the tournament
must be the true i-th best player. (The last step takes
$\lceil \log(n-i+2) \rceil - 1$ matches.) Summing the matches,
$(n-i+1) + (i-2)\lceil \log(n-i+2) \rceil + (\lceil \log(n-i+2) \rceil - 1)$,
we get Equ. (2).
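As a quick numerical check of the two upper bounds, the following Python sketch evaluates the right-hand sides of (1) and (2). The function names are illustrative and do not come from the thesis; logarithms are taken to the base 2, as in the text, and the ceiling of the logarithm is computed with exact integer arithmetic.

```python
def ceil_log2(m: int) -> int:
    """Return ceil(log2(m)) for an integer m >= 1 (exact, no floating point)."""
    if m < 1:
        raise ValueError("m must be a positive integer")
    return (m - 1).bit_length()


def kislitsyn_upper(n: int, i: int) -> int:
    """Right-hand side of (1): n - i + sum_{k=0}^{i-2} ceil(log2(n - k))."""
    if not 1 <= i <= n:
        raise ValueError("need 1 <= i <= n")
    return n - i + sum(ceil_log2(n - k) for k in range(i - 1))


def hadian_sobel_upper(n: int, i: int) -> int:
    """Right-hand side of (2): n - i + (i - 1) * ceil(log2(n - i + 2))."""
    if not 1 <= i <= n:
        raise ValueError("need 1 <= i <= n")
    return n - i + (i - 1) * ceil_log2(n - i + 2)


if __name__ == "__main__":
    for n, i in [(8, 1), (8, 2), (8, 3), (100, 5)]:
        print(n, i, kislitsyn_upper(n, i), hadian_sobel_upper(n, i))
```

For i=1 both expressions reduce to n-1, and for i=2 both reduce to Schreier's $n - 2 + \lceil \log n \rceil$.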


For small i, both of the algorithms described above seem to be
rather efficient. However, they require O(n log n) comparisons when
i = O(n). For a while it was not clear whether any efficient
algorithms might exist in the case of large i. Finally, in 1971 Blum,
Floyd, Pratt, Rivest, and Tarjan[1] proposed a uniform procedure
which can find the i-th largest of n elements in 5.73n comparisons
for all i.


A very challenging task in the theoretical study of algorithms
is to prove lower bounds on the complexity of a given problem. Until
recently little was known about lower bounds in the tennis tournament
problem. The first step was made in 1964 by Kislitsyn who showed, in
the same paper cited above, that $n - 2 + \lceil \log n \rceil$ is in
fact the minimum number of comparisons required by any algorithm to
find the largest and the second largest elements. Therefore,
Schreier's construction gives an optimal algorithm. In 1971 Blum et
al.[1] proved that $n + i + \lceil \log n \rceil - 4$ is a lower bound
for selecting the i-th largest element where i < n/2. Recently this
result was further improved by Pratt[6] to give

$$V_i(n) \ge n + 2i - \log n \quad \text{for } i \le n/3,$$

and

$$V_i(n) \ge (3n+i)/2 - \log n \quad \text{for } n/3 \le i \le n/2. \tag{3}$$


Impressive as this bound is, it still leaves a considerable gap
between upper and lower bounds when i is small. As a matter of fact,
the asymptotic behavior of the Hadian-Sobel upper bound (2) is
~ n + (i-1)log n while that of the previous lower bound is ~ n + log n.
It is therefore an intriguing problem to determine what the true
asymptotic behavior of the optimal algorithm is.


In this thesis, we present lower bounds for the selection of the
i-th largest element and the selection of the largest i elements (in
order). For i small relative to n, the gap between upper and lower
bounds for either problem now has the leading term
$(i-1)(i-3)\log(\log^* n)$, where $\log^* n$ is a very slowly growing
function (cf. its definition in Sec. 3.2). In the case of i = 3, we
have a tighter lower bound which actually determines $V_3(n)$
precisely for infinitely many values of n, and determines $V_3(n)$
and $W_3(n)$ to within 1 or 2 for all n.


1.2 Tree Algorithm 


Let $x_1, \ldots, x_n$ be n distinct numbers. To select the i-th
largest of $x_1, \ldots, x_n$ is to determine $x_r$, $1 \le r \le n$,
such that

$$|\{ x_j \mid x_j > x_r \}| \ge i-1 \quad \text{and} \quad |\{ x_j \mid x_j < x_r \}| \ge n-i.$$

We shall write $F_i(x_1, \ldots, x_n) = r$ to denote that $x_r$ is the
i-th largest of $x_1, \ldots, x_n$. To sort $x_1, \ldots, x_n$, then,
is to compute $F_i(x_1, \ldots, x_n)$ simultaneously for
$1 \le i \le n$, or equivalently, to determine a permutation
p(1)p(2)...p(n) such that

$$x_{p(1)} > x_{p(2)} > \cdots > x_{p(n)}.$$

We write $G(x_1, \ldots, x_n) = p$ for the permutation p which
satisfies this condition.


It is quite obvious that if $x_1, \ldots, x_n$ and $y_1, \ldots, y_n$
are two sequences each consisting of n distinct numbers, with the
property that $(\forall j)(\forall k)[\, x_j < x_k \Leftrightarrow y_j < y_k \,]$,
then we must have $F_i(x_1, \ldots, x_n) = F_i(y_1, \ldots, y_n)$ for
all i, and $G(x_1, \ldots, x_n) = G(y_1, \ldots, y_n)$. In other
words, for operations such as sorting or selecting the i-th largest
element, the answer depends solely on the ordering relation between
the input elements. Hence, for this class of problems, it is natural
to study computing techniques that are based entirely on pairwise
comparisons between the input items.


An algorithm that satisfies the above constraint can be represented
by a binary tree structure, such as that shown in Figure 1, and
will be called a tree algorithm. Each internal node contains two
indices "i:j", denoting a comparison of the i-th input $x_i$ versus
the j-th input $x_j$. For simplicity, we shall always assume that all
inputs are distinct, so that each comparison may have two outcomes.
The left subtree of this node then represents the subsequent
comparisons to be made if $x_i < x_j$, and the right subtree describes
the succeeding moves when $x_i > x_j$.

Figure 1 A tree algorithm for selecting the second largest of four elements

Since we will be interested in minimizing the number of comparisons
needed, we may assume that no redundant comparisons occur in a tree
algorithm. Thus the left and right subtrees of an internal node must
both be accessible, and hence all external nodes are reachable from
the root. If this algorithm is to compute the i-th largest of n
items, then each external node of the tree shall contain an index r
denoting the fact that $F_i(x_1, \ldots, x_n) = r$ has been
established as a result of the comparisons made along the path from
the root to this node. Similarly, for sorting algorithms each external
node shall contain a permutation which equals $G(x_1, \ldots, x_n)$.
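To make the tree representation concrete, here is a minimal Python sketch of a tree algorithm for the second largest of three elements (a smaller instance than Figure 1, whose tree is not reproduced here). The tuple encoding and all function names are assumptions made for illustration only. The sketch runs the tree on an input, measures the length of its longest root-to-leaf path, and checks by brute force that every external node carries the right index.

```python
from itertools import permutations

# Internal node: (i, j, left, right) means "compare x_i with x_j" (1-based);
# follow `left` if x_i < x_j and `right` if x_i > x_j.
# External node: an int r, meaning "x_r is the answer".
MEDIAN_OF_3 = (1, 2,
               (2, 3, 2, (1, 3, 3, 1)),   # subtree for x_1 < x_2
               (1, 3, 1, (2, 3, 3, 2)))   # subtree for x_1 > x_2


def run(tree, xs):
    """Follow the comparisons for input xs; return (answer index, comparisons used)."""
    node, comparisons = tree, 0
    while not isinstance(node, int):
        i, j, left, right = node
        node = left if xs[i - 1] < xs[j - 1] else right
        comparisons += 1
    return node, comparisons


def max_path_length(tree):
    """Length of the longest root-to-leaf path of the comparison tree."""
    if isinstance(tree, int):
        return 0
    _, _, left, right = tree
    return 1 + max(max_path_length(left), max_path_length(right))


def computes_ith_largest(tree, n, i):
    """Check the tree on every ordering of n distinct inputs."""
    for xs in permutations(range(n)):
        r, _ = run(tree, xs)
        if xs[r - 1] != sorted(xs, reverse=True)[i - 1]:
            return False
    return True


if __name__ == "__main__":
    print(run(MEDIAN_OF_3, (10, 30, 20)))
    print(max_path_length(MEDIAN_OF_3))
    print(computes_ith_largest(MEDIAN_OF_3, 3, 2))
```

Since the answer depends only on the relative order of the inputs, checking all permutations of 0, ..., n-1 is enough to verify such a tree for every input of distinct numbers.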


As a complexity measure, we shall define the cost of a tree
algorithm to be the maximal path length of its comparison tree.
Thus, the functions $V_i(n)$ and $W_i(n)$ that were informally defined
in the last section can be expressed as

$V_i(n)$ = min { cost of A | A is a tree algorithm that computes
$F_i(x_1, \ldots, x_n)$ },

$W_i(n)$ = min { cost of A | A is a tree algorithm that computes
$F_j(x_1, \ldots, x_n)$ simultaneously for $1 \le j \le i$ }.

1.3 The Concept of a Strategy


The cost of a tree algorithm can be considered from a game-theoretic
point of view. This approach, as we shall see, enables us to
obtain tight lower bounds on the minimum number of comparisons required
to do various selections.


For example, we may look at the computation of the i-th largest
of n items as a two-person game in the following way. There is
player A who makes the opening move, and whose move shall always be an
inquiry of the form "Is $x_j$ less than $x_k$?" The adversary B, in
his turns, shall give a reply of either "yes" or "no" to the question
that A just asked. The game ends when A can successfully determine
the i-th largest element. The payoff of B in this game is
defined to be the total number of questions asked by A.


Now, recall that in game theory, a player's "strategy" is simply a
complete specification of what he would do in each situation that
might arise in the play of the game. In the game described above, any
strategy that A may employ corresponds to a tree algorithm for
selecting the i-th largest element, and vice versa. Once A has selected
a strategy (i.e., a tree algorithm is given), the choice of a strategy
on B's part will then completely determine the path that the
computation is going to follow in the comparison tree of this algorithm.


Therefore, if we can show that for any strategy A may use, B can
always find an "adversary strategy" that guarantees a payoff of at
least $L_i(n)$ for B, then we in fact will have proved that

$$V_i(n) \ge L_i(n).$$


Notice that, as far as proving lower bounds for tree algorithms
is concerned, adversary B has the privilege of consulting the strategy
which A has chosen while he makes up his own. However, all the
adversary strategies that are considered in the following chapters do
have uniform prescriptions, i.e., decisions are solely based on
previous moves and do not depend on the knowledge of A's strategy.


In giving descriptions of strategies and situations that might
arise in a computation, it is convenient to use the terminology of
tennis tournaments. After all, this is where the selection problem
originated! So, let us regard each input item as a tennis player.
A comparison between $x_i$ and $x_j$ then becomes a "match" of $x_i$
versus $x_j$. If the outcome is $x_i < x_j$, we say "$x_i$ is defeated
by $x_j$"; if $x_i > x_j$, we say "$x_i$ beats $x_j$", etc. (Here we
have to assume that tennis skill satisfies transitivity, i.e., if
$x_i$ beats $x_j$ and $x_j$ beats $x_k$ then $x_i$ would beat $x_k$.)
A sequence of matches, each tagged with its outcome, is called a
tournament, usually denoted by T. The length of a tournament T,
written as |T|, is the total number of matches T contains. Thus the
payoff of adversary B corresponds to the length of the resulting
tournament. In connection with a tournament T, we shall let time t
refer to the point when the t-th match of T has just been completed.

1.4 Crucial vs. Noncrucial Comparisons


With respect to a tournament T, let us write $x_j < x_k$ if in the
final ranking $x_j$ is determined to be dominated by $x_k$. We write
$x_j \le x_k$ if either $x_j < x_k$ or j = k. It is easy to see that

1) if $x_k < x_r$, then in some match $x_k$ must be defeated by an
   $x_j$ such that $x_j \le x_r$;
2) if $x_r < x_k$, then in some match $x_k$ must beat an $x_j$ such
   that $x_r \le x_j$.

Now, if T selects the i-th best player, there must be a player $x_r$
such that

$$|\{ x_j \mid x_j < x_r \}| = n-i \quad \text{and} \quad |\{ x_j \mid x_r < x_j \}| = i-1.$$

Hence for each player other than $x_r$ we must be able to isolate a
match of either type 1) or 2), which we call a crucial match. Thus
T must contain n-1 crucial matches. Alternatively, if we look at
the Hasse diagram that represents the partial ordering of the players
as established by T, n-1 crucial matches are those required to form
the links of a spanning tree on the n nodes, with $x_r$ looking like a
"bottleneck" (cf. Figure 2). Thus we know that

$$V_i(n) \ge n-1 \quad \text{for all } i.$$

Incidentally, this proves $V_1(n) = n-1$ since it is easy to see that
$V_1(n) \le n-1$. For general i, in order to improve this estimate on
the lower bound of $V_i(n)$, the adversary must try to entrap A into
making a large number of comparisons that are noncrucial.
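The condition for a tournament to select the i-th best player can be checked mechanically. The Python sketch below is an illustration under assumed conventions (players numbered 0 to n-1, a tournament given as a list of (winner, loser) pairs, function name hypothetical): it forms the transitive closure of the match results and looks for a player known to be below exactly i-1 players and above exactly n-i players.

```python
def selected_player(n, matches, i):
    """Return the player established as i-th best by the matches, or None.

    matches: list of (winner, loser) pairs over players 0..n-1.
    """
    # beats[a][b] is True when a > b follows from the matches played.
    beats = [[False] * n for _ in range(n)]
    for winner, loser in matches:
        beats[winner][loser] = True
    # Warshall's algorithm: close the relation under transitivity.
    for k in range(n):
        for a in range(n):
            if beats[a][k]:
                for b in range(n):
                    if beats[k][b]:
                        beats[a][b] = True
    for x in range(n):
        above = sum(beats[y][x] for y in range(n))   # players known to dominate x
        below = sum(beats[x][y] for y in range(n))   # players known to be dominated by x
        if above == i - 1 and below == n - i:
            return x
    return None


if __name__ == "__main__":
    knockout = [(0, 1), (2, 3), (0, 2)]          # n - 1 = 3 matches
    print(selected_player(4, knockout, 1))       # the champion is determined
    print(selected_player(4, knockout, 2))       # the second best is not yet determined
    print(selected_player(4, knockout + [(2, 1)], 2))
```

In the small example, three matches suffice for the champion, while one further match is needed before the second best of four players is established.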


Figure 2 Crucial matches (solid lines) and noncrucial matches
(dotted lines) for selecting $x_r$ as the fourth best
of 12 players.

CHAPTER 2

SELECTION OF THE THIRD BEST PLAYER

2.1 Introduction 


The adversary approach, as introduced in Sec. 1.3, was used by
Knuth (Sec. 5.3.3 of [4]) in proving Kislitsyn's theorem that
$n - 2 + \lceil \log n \rceil$ comparisons are indeed necessary for
finding the second largest of n items. In this chapter, we present
Basic Strategy, which is a slight modification of the strategy Knuth
used. We shall first use Basic Strategy to reconstruct Knuth's proof,
and then proceed to establish lower bounds for $V_3(n)$ and $W_3(n)$.
Lemma 1 describes a specific situation that must arise in any
computation controlled by Basic Strategy. By taking advantage of this
situation, one can create a large number of noncrucial comparisons,
and lower bounds for $V_3(n)$ and $W_3(n)$ are obtained thereby. As a
consequence, $V_3(n)$ is determined precisely for infinitely many
values of n, and to within 1 or 2 for general n. $W_3(n)$ is also
determined to within 1 or 2 for all n.

2.2 Weight Function and Basic Strategy


In Sec. 1.3 we introduced the notion of using strategies to
establish lower bounds for selection problems. The Basic Strategy, to
be defined below, shall play an essential role in the forming of our
final strategies for the adversary. It is a modification of the
strategy Knuth used in proving Kislitsyn's lower bound on $V_2(n)$.
First we have to define a weight function:

Definition 1 With respect to a tournament T, $Q_t(x)$, where x is any
player and $0 \le t \le |T|$, is a positive integer known as x's
weight at time t, and is defined recursively as follows:

Q1. $Q_0(x) = 1$ for all x.

Q2. If x inflicted y's first loss in the t-th match of T, then
    $Q_t(x) = Q_{t-1}(x) + Q_{t-1}(y)$, and $Q_t(z) = Q_{t-1}(z)$ for
    all z such that $z \ne x$.

Q3. If the loser of the t-th match was previously beaten, then
    $Q_t(x) = Q_{t-1}(x)$ for all x.


We can now present

Definition 2 Basic Strategy
Assume x plays y in the t-th match of T:
BS1. If x is yet undefeated and y is not, then let x win.
BS2. If both x and y are undefeated and $Q_{t-1}(x) \ge Q_{t-1}(y)$,
     then let x win.
BS3. In other circumstances, let the outcome be arbitrary.

Definition 3 A tournament is said to be BS-ruled if the outcomes of
the successive matches are decided by Basic Strategy.
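The weight function and the three rules of Basic Strategy translate directly into a small adversary object. The following Python sketch is one possible rendering, not the thesis's own formulation: the class and attribute names are hypothetical, ties in BS2 are resolved in favour of the first-named player, and the "arbitrary" outcome of BS3 is chosen so that it never contradicts a result already implied by transitivity (an assumption needed only because the caller is not prevented from asking redundant questions).

```python
class BasicStrategyAdversary:
    """Answers matches between players 0..n-1 according to BS1-BS3."""

    def __init__(self, n):
        self.weight = [1] * n                    # Q1: every weight starts at 1
        self.defeated = [False] * n
        self.first_defeats = [0] * n             # first defeats made by each player
        self.beaten = [set() for _ in range(n)]  # beaten[w] = players w has beaten directly

    def _dominates(self, a, b):
        """True if a > b already follows by transitivity from earlier matches."""
        stack, seen = [a], {a}
        while stack:
            u = stack.pop()
            for v in self.beaten[u]:
                if v == b:
                    return True
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        return False

    def play(self, x, y):
        """Decide the match x vs. y and return the winner."""
        if x == y:
            raise ValueError("a player cannot play himself")
        if not self.defeated[x] and self.defeated[y]:
            winner, loser = x, y                                  # BS1
        elif not self.defeated[y] and self.defeated[x]:
            winner, loser = y, x                                  # BS1, roles swapped
        elif not self.defeated[x] and not self.defeated[y]:
            # BS2: the heavier undefeated player wins; ties go to x.
            winner, loser = (x, y) if self.weight[x] >= self.weight[y] else (y, x)
        else:
            # BS3: arbitrary, but consistent with what is already known.
            winner, loser = (y, x) if self._dominates(y, x) else (x, y)
        if not self.defeated[loser]:
            # Q2: a first loss adds the loser's weight to the winner's.
            self.weight[winner] += self.weight[loser]
            self.first_defeats[winner] += 1
            self.defeated[loser] = True
        # Q3: otherwise all weights stay unchanged.
        self.beaten[winner].add(loser)
        return winner


if __name__ == "__main__":
    adv = BasicStrategyAdversary(8)
    for x, y in [(0, 1), (2, 3), (4, 5), (6, 7), (0, 2), (4, 6), (0, 4)]:
        adv.play(x, y)
    print(adv.weight, adv.first_defeats)
```

Playing a knockout tournament of 8 players against this adversary leaves the champion with weight 8 after 3 first defeats, and no player's weight exceeds 2 raised to the number of first defeats he has made.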


The following are some central effects of our Basic Strategy.

Fact 1 In a BS-ruled tournament, once a player is defeated, his
weight does not increase from then on.
Proof By the definition of $Q_t(x)$, a player's weight increases only
when he makes a first defeat, i.e., inflicts another player's first
loss. From BS1, a player who has been defeated cannot inflict any
first loss, therefore his weight shall stay the same.

Fact 2 In a BS-ruled tournament, a player's weight is at most $2^k$
after he makes k first defeats.
Proof A player's weight increases only with every first defeat he
makes. By BS2 and Q2, his weight at most doubles when he does
make a first defeat. Since the initial weight is 1, Fact 2 is
proved by induction on k.


In the definitions of weight and Basic Strategy, only a player's
first defeat is relevant. Indeed, we can represent the players'
weights in a BS-ruled tournament T by a tree structure, called the
first-defeat-tree (abbr. FDT) of T. Every player is represented by
a node in the FDT, and y is a son of x iff x inflicted y's
first loss in T. Thus the FDT may be a disjoint union of several
rooted trees, with every root representing an undefeated player (cf.
Figure 3). In view of Fact 1, we shall let Q(x) denote the maximum
value of x's weight in a BS-ruled tournament. Then we have

Fact 3 Given a BS-ruled tournament T, let D(x) be the set of x's
descendants (including x itself) in the FDT of T. Then Q(x) = |D(x)|.
Proof Argue by induction on the number of first defeats x makes.
Use Q2 and Fact 1. (Details omitted.)


Thus, by Fact 3, Q(x) corresponds to the total number of nodes
(or the actual "weight") of the subtree rooted at x in the FDT. For
example, in Figure 3 we have Q(x)=10 and Q(y)=4.
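Fact 3 can be illustrated by counting descendants in a first-defeat-tree. In the Python sketch below the tree is given by a list `first_loss_to` (a hypothetical encoding: entry y holds the player who inflicted y's first loss, or `None` for an undefeated player), and the function returns |D(x)| for every player x. The sample tree is an assumed example, the FDT of an 8-player knockout tournament, and is not the tree of Figure 3.

```python
def subtree_sizes(first_loss_to):
    """Return |D(x)| for each player x of a first-defeat-tree.

    first_loss_to[y] is the player who inflicted y's first loss,
    or None if y is undefeated (a root of the FDT).
    """
    n = len(first_loss_to)
    size = [1] * n                      # every player is his own descendant
    for y in range(n):
        ancestor = first_loss_to[y]
        while ancestor is not None:     # credit y to each of its ancestors
            size[ancestor] += 1
            ancestor = first_loss_to[ancestor]
    return size


if __name__ == "__main__":
    # Knockout among 8 players: 0 beats 1, 2, 4; 2 beats 3; 4 beats 5, 6; 6 beats 7.
    fdt = [None, 0, 0, 2, 0, 4, 4, 6]
    print(subtree_sizes(fdt))
```

For this tree the sizes are 8 for the champion and 4, 2, 2 for the players he beat last, second and first, which are exactly the weights the weight function assigns in such a knockout.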


Figure 3 First-defeat-tree of a BS-ruled tournament

It is now a simple matter to prove

Theorem (Kislitsyn) $V_2(n) \ge n - 2 + \lceil \log n \rceil$.
Proof Given any tree algorithm for selecting the second best player,
we apply the Basic Strategy to it. Note that the champion is also
determined in the resulting tournament T (he is the single player
dominating the selected second best); call him $x_c$. Since he must
be the only undefeated player in T, in the FDT of T he is hence the
unique root. By Fact 3, we have $Q(x_c) = n$. Therefore, by Fact 2,
$x_c$ must have made at least $\lceil \log n \rceil$ first defeats in
T. However, at most one of these could be a crucial match (cf. Sec.
1.4) as far as selecting the second best player is concerned. Since
there must be n-2 other crucial matches, we conclude that
$|T| \ge n - 2 + \lceil \log n \rceil$. This proves that
$V_2(n) \ge n - 2 + \lceil \log n \rceil$.

2.3 Nearly Exact Bounds for i=3


In this section, we proceed to prove fairly tight lower bounds for
$V_3(n)$ (the minimum number of comparisons for selecting the third
largest element), and $W_3(n)$ (that for selecting the first, second,
and third largest elements in order).

2.3.1 Main Theorem 


The main result of this section is the following theorem: 


Theorem 1 $V_3(n) \ge H(n)$, where for $k \ge 1$,

$$H(n) = \begin{cases}
n - 3 + 2\lceil \log(n-1) \rceil & \text{if } n = 2^k + 1, \\
n - 4 + 2\lceil \log(n-1) \rceil & \text{if } 3 \cdot 2^{k-1} < n \le 2^{k+1}, \\
n - 5 + 2\lceil \log(n-1) \rceil & \text{if } 2^k + 1 < n \le 3 \cdot 2^{k-1}.
\end{cases}$$

By comparing H(n) with Hadian and Sobel's upper bound (cf. Sec.
1.1) $V_3(n) \le n - 3 + 2\lceil \log(n-1) \rceil$, we see that
$V_3(n)$ is determined precisely if $n = 2^k + 1$, and is determined
to within 1 or 2 for general n. Since $W_3(n) \ge V_3(n) \ge H(n)$,
comparing H(n) with Kislitsyn's upper bound
$W_3(n) \le n - 3 + \lceil \log n \rceil + \lceil \log(n-1) \rceil$
shows that $W_3(n)$ is determined to within 1 or 2 for all n.
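The comparison between H(n) and the Hadian-Sobel upper bound can be tabulated directly. The Python sketch below evaluates H(n) under the reading of the three cases adopted here ($n = 2^k + 1$; $3 \cdot 2^{k-1} < n \le 2^{k+1}$; $2^k + 1 < n \le 3 \cdot 2^{k-1}$); the function names are illustrative. It then reports the gap to $n - 3 + 2\lceil \log(n-1) \rceil$.

```python
def ceil_log2(m: int) -> int:
    """Return ceil(log2(m)) for an integer m >= 1."""
    return (m - 1).bit_length()


def H(n: int) -> int:
    """Lower bound of Theorem 1 on V_3(n), for n >= 3."""
    if n < 3:
        raise ValueError("need n >= 3")
    c = ceil_log2(n - 1)
    k = (n - 1).bit_length() - 1        # the k >= 1 with 2**k + 1 <= n <= 2**(k+1)
    if n == 2 ** k + 1:
        return n - 3 + 2 * c
    if n > 3 * 2 ** (k - 1):            # 3 * 2**(k-1) < n <= 2**(k+1)
        return n - 4 + 2 * c
    return n - 5 + 2 * c                # 2**k + 1 < n <= 3 * 2**(k-1)


def v3_upper(n: int) -> int:
    """Hadian-Sobel upper bound (2) for i = 3."""
    return n - 3 + 2 * ceil_log2(n - 1)


if __name__ == "__main__":
    for n in range(3, 18):
        print(n, H(n), v3_upper(n), v3_upper(n) - H(n))
```

For n = 3 and n = 5 the two bounds coincide at 2 and 6 comparisons respectively, and over any range of n the gap printed in the last column is 0, 1 or 2.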


The following lemma is the key to the proof of Theorem 1. To
simplify notation, let us define

$$h(n) = H(n) - n + 1.$$

It may be helpful to keep in mind that h(n) is an estimate of the
number of noncrucial (i.e., wasted) comparisons that the adversary
is aiming to create.


Lemma 1 Any BS-ruled tournament that selects the third best of n
players must reach a point in time t when one of the following two
situations arises:

(1) There exist 3 undefeated players A, B, and C; A and B together
    have made at least h(n) first defeats so far.

(2) There exist 2 undefeated players A' and B' who together
    have made at least h(n) + 1 first defeats so far.

We first show how Theorem 1 can be proved by using Lemma 1. The
proof of Lemma 1 will be given in Sec. 2.3.2.


Proof of Theorem 1 We present a strategy which, when applied to any
algorithm for selecting the third best player, will result in a
tournament that contains at least H(n) matches. The strategy has two
phases:

Phase I Basic Strategy
Follow Basic Strategy until either situation (1) or situation
(2) of Lemma 1 occurs. At this point t, switch to

Phase II Clear Strategy
CS1. As a follow-up strategy for situation (1), let A, B, C
always win when they play other players. Among A, B, C we shall
assign the order A>C, B>C. In other cases we do not care.
CS2. As a follow-up strategy for situation (2), let A', B'
always win when they play other players. In other cases, we do not
care.


Now, in the case of situation (1) followed up by CS1, player
C will necessarily be selected as the third best player, dominated by
A and B. But then, none of the h(n) matches mentioned in
situation (1) is a crucial match as far as selecting the third best
player is concerned. Hence the entire tournament must contain at
least n - 1 + h(n) = H(n) matches.

Similarly, in the case of situation (2) followed up by CS2,
A' and B' necessarily become the two top winners of the tournament.
Among those h(n) + 1 first defeats made by A' and B' before time t,
one could be the first defeat of the third best player. However, this
is the only case where those h(n) + 1 matches may contain a crucial
one. Since there ought to be n - 2 other crucial matches, we see
that the length of the tournament is at least n - 2 + (h(n) + 1),
which equals H(n). Theorem 1 is thus shown to follow from Lemma 1.

2.3.2 Proof of Lemma 1 


It is clear that any BS-ruled tournament that selects the third
best player must fall into one of the following two classes:
(i) Those in which there is only one undefeated player left at the end.
(ii) Those in which there are two undefeated players left at the end.

For the proof of Lemma 1, we need only consider case (i). Indeed,
suppose the lemma is proved in this case. Let T be a tournament of
class (ii), so that two undefeated players, say x and y, are found at
time |T|. If $Q_{|T|}(y) \le Q_{|T|}(x)$, we may extend T by letting
x play and defeat y in an additional match. Call the resulting
tournament T'. Clearly T' has been following Basic Strategy, and
moreover is in class (i), so the lemma is true for T'. However, the
particular time t in T' as characterized by the lemma must satisfy
$t \le |T|$, since at time |T| + 1 there is but one undefeated player
x, not satisfying either assertion (1) or (2). Hence the lemma must
be in fact true for T.


Let us now look at the first-defeat-tree of a tournament of class
(i) (cf. Figure 4). The only undefeated player, namely the champion,
is denoted by $x_c$. Let $x_0, x_1, \ldots, x_s$ be the sons of $x_c$
arranged in the order their first defeats by $x_c$ took place. Thus
if $t_j$, $0 \le j \le s$, is the time $x_c$ inflicted $x_j$'s first
loss, then $t_0 < t_1 < \cdots < t_s$. For $0 \le j \le s$, let $d_j$
denote the number of first defeats made by $x_j$. (By BS1, they must
all take place before time $t_j$.)


Claim Let

$$M = \max\{\, j + d_j \mid 0 \le j \le s \,\}$$

and

$$\ell = \min\{\, j \mid 0 \le j \le s,\ j + d_j = M \,\};$$

then
(I) if $\ell < s$, we must have $M \ge h(n)$.
(II) if $\ell = s$, we must have $M \ge h(n) + 1$.


Note that once Claim is proved, Lemma 1 will follow immediately.
For, in case (I) is true, we can choose t to be the time $t_\ell - 1$.
At this moment, $x_c$, $x_\ell$ and $x_{\ell+1}$ (note
$\ell + 1 \le s$) are all undefeated since
$t_\ell - 1 < t_\ell < t_{\ell+1}$. By this time $x_c$ has made $\ell$
first defeats (namely on $x_0, x_1, \ldots, x_{\ell-1}$), and $x_\ell$
has made $d_\ell$. Since by (I) $\ell + d_\ell = M \ge h(n)$,
assertion (1) of Lemma 1 is fulfilled if we choose A, B, C to be
$x_c$, $x_\ell$ and $x_{\ell+1}$. On the other hand, if (II) of Claim
is true, we can choose t to be $t_s - 1$. At this time $x_c$ and $x_s$
are undefeated, and have made a total of $s + d_s = M \ge h(n) + 1$
first defeats. Thus choosing A' and B' to be $x_c$ and $x_s$ will
satisfy assertion (2) of Lemma 1.


Figure 4 FDT with a single root 




So we have reduced the proof of Lemma 1 to the verification of 


cases (I) and (II) of Claim. 


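Proof of Claim

(I) Assume $\ell < s$.

By the definition of M, we have $j + d_j \le M$ for all
$0 \le j \le s$. In particular,

$$d_{\lceil M/2 \rceil} \le M - \lceil M/2 \rceil = \lfloor M/2 \rfloor,$$
$$d_{\lceil M/2 \rceil + 1} \le M - (\lceil M/2 \rceil + 1) = \lfloor M/2 \rfloor - 1,$$
$$\vdots$$
$$d_s \le M - s.$$

By Fact 2, $Q(x_j) \le 2^{d_j}$ for all $0 \le j \le s$. Hence we have

$$Q(x_{\lceil M/2 \rceil}) \le 2^{\lfloor M/2 \rfloor}, \quad
Q(x_{\lceil M/2 \rceil + 1}) \le 2^{\lfloor M/2 \rfloor - 1}, \quad \ldots, \quad
Q(x_s) \le 2^{M-s}.$$

Summing the weights, we get

$$\sum_{j=\lceil M/2 \rceil}^{s} Q(x_j) \le 2^{\lfloor M/2 \rfloor} + 2^{\lfloor M/2 \rfloor - 1} + \cdots + 2^1 + 2^0 = 2^{\lfloor M/2 \rfloor + 1} - 1. \tag{4}$$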


Also, $x_c$ has only made j first defeats prior to time $t_j$, hence by
Fact 2 of Sec. 2.2 has weight at most $2^j$ before he plays $x_j$. This
implies that $Q(x_j) \le 2^j$, for otherwise $x_j$ would not lose to
$x_c$ by BS2 of Basic Strategy. In particular, this is true for $x_j$
such that $0 \le j < \lceil M/2 \rceil$. Therefore,

$$\sum_{j=0}^{\lceil M/2 \rceil - 1} Q(x_j) \le \sum_{j=0}^{\lceil M/2 \rceil - 1} 2^j = 2^{\lceil M/2 \rceil} - 1. \tag{5}$$


On the other hand, since in the first-defeat-tree the union of the
subtrees rooted at $x_j$, $0 \le j \le s$, contains every node except
$x_c$, by Fact 3 of Sec. 2.2 we have

$$\sum_{j=0}^{s} Q(x_j) = n - 1. \tag{6}$$

Now, from Equs. (4), (5) and (6), we obtain

$$2^{\lceil M/2 \rceil} + 2^{\lfloor M/2 \rfloor + 1} - 2 \ge n - 1.$$

Solving this inequality for M and comparing with the definition of
h(n), we conclude that

$$M \ge h(n).$$
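The step "solving this inequality for M" can be checked numerically. The Python sketch below is an illustration under two stated assumptions: the inequality is read as $2^{\lceil M/2 \rceil} + 2^{\lfloor M/2 \rfloor + 1} - 2 \ge n - 1$, and H(n) is evaluated with the three cases $n = 2^k + 1$, $3 \cdot 2^{k-1} < n \le 2^{k+1}$, and $2^k + 1 < n \le 3 \cdot 2^{k-1}$. Since the left-hand side increases with M, it is enough to find the smallest M that satisfies the inequality and compare it with h(n) = H(n) - n + 1. All function names are hypothetical.

```python
def ceil_log2(m: int) -> int:
    """Return ceil(log2(m)) for an integer m >= 1."""
    return (m - 1).bit_length()


def H(n: int) -> int:
    """H(n) of Theorem 1, for n >= 3 (three cases as read from the text)."""
    c = ceil_log2(n - 1)
    k = (n - 1).bit_length() - 1        # 2**k + 1 <= n <= 2**(k+1)
    if n == 2 ** k + 1:
        return n - 3 + 2 * c
    if n > 3 * 2 ** (k - 1):
        return n - 4 + 2 * c
    return n - 5 + 2 * c


def h(n: int) -> int:
    """h(n) = H(n) - n + 1."""
    return H(n) - n + 1


def smallest_M(n: int) -> int:
    """Smallest M >= 0 with 2^ceil(M/2) + 2^(floor(M/2)+1) - 2 >= n - 1."""
    M = 0
    while 2 ** ((M + 1) // 2) + 2 ** (M // 2 + 1) - 2 < n - 1:
        M += 1
    return M


if __name__ == "__main__":
    for n in range(3, 18):
        print(n, smallest_M(n), h(n))
```

In every row the smallest admissible M is at least h(n), with equality for n of the form $2^k + 1$ with $k \ge 2$ (for example n = 5 and n = 9).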


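(II) Assume $\ell = s$.

By the definition of $\ell$, we have $M = s + d_s$ but $j + d_j < M$
for all $j < s$. Hence,

$$d_{\lceil M/2 \rceil} < M - \lceil M/2 \rceil = \lfloor M/2 \rfloor,$$
$$d_{\lceil M/2 \rceil + 1} < M - (\lceil M/2 \rceil + 1) = \lfloor M/2 \rfloor - 1,$$
$$\vdots$$

Again, by Fact 2 we have

$$\sum_{j=\lceil M/2 \rceil}^{s} Q(x_j) \le 2^{\lfloor M/2 \rfloor - 1} + \cdots + 2^{M-s} + 2^{M-s} = 2^{\lfloor M/2 \rfloor}. \tag{7}$$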


As a result, we get

$$M \ge h(n) + 1.$$

Thus, Claim has been proved, and the proof of Lemma 1 is now complete.

CHAPTER 3

GENERAL LOWER BOUNDS FOR SMALL i


3.1 Introduction 


In this chapter, we shall generalize the techniques of
Chapter 2 to deduce lower bounds on $V_i(n)$ and $W_i(n)$ for small i.
Although we appeal to the same scheme of using a two-phase strategy,
the analysis involved for i > 3 is considerably more difficult.
The major effort is contained in the proof of Lemma 2, which is
a generalization of the analysis we did on "first-defeat-trees"
in Lemma 1 of Sec. 2.3.


For any fixed i, our lower bound for $V_i(n)$ and $W_i(n)$ has the
asymptotic behavior $n + (i-1)\log n - O(\log(\log^* n))$. By
comparing it with upper bounds as given by Equs. (1) and (2) of
Sec. 1.1, we see that the asymptotic behaviors of both $V_i(n)$ and
$W_i(n)$ are determined to within a term of order
$O(\log(\log^* n))$.

3.2 Main Theorem


We shall derive lower bounds on $V_i(n)$ and $W_i(n)$ for i > 3
by induction on i, based on the result established for i = 3 in
the previous chapter. The resulting formulas for both of our lower
bounds involve a quantity which is dependent on i as well as on n.
This quantity, formally defined below as $h_i(n)$, is fairly close to
log(n) when i is small relative to n.


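Notation For $n \ge 2$, let $\log^* n$ denote the largest integer k
such that

$$n \ge \left. 2^{2^{\cdot^{\cdot^{\cdot^{2}}}}} \right\} k,$$

i.e., n is at least a tower of k 2's. Thus,
$\log^* 2 = \log^* 3 = 1$,
$\log^* 4 = \log^* 5 = \cdots = \log^* 15 = 2$, and so forth.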
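The definition of $\log^* n$ used here (the height of the tallest tower of 2's not exceeding n) is easy to compute. The following Python sketch, with an illustrative function name, builds the tower step by step and uses the bit length of n to avoid ever forming a power larger than n.

```python
def log_star(n: int) -> int:
    """Largest k such that a tower of k 2's is <= n, for n >= 2."""
    if n < 2:
        raise ValueError("log* is defined here for n >= 2")
    k, tower = 1, 2                     # a tower of one 2 equals 2
    # 2**tower <= n holds exactly when tower < n.bit_length().
    while tower < n.bit_length():
        k, tower = k + 1, 2 ** tower
    return k


if __name__ == "__main__":
    for n in (2, 3, 4, 15, 16, 65535, 65536):
        print(n, log_star(n))
```

The function returns 1 for n = 2, 3 and 2 for n = 4, ..., 15, in agreement with the values listed above; the next change occurs at n = 16 and then at n = 65536.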


Definition 4 For any $i \ge 3$, define $h_i(n)$ by the following formula:

$$h_3(n) = \log n - 2;$$
$$\text{for } i > 3, \quad h_i(n) = \log n - (i-3)\log(\log^* n) - 2\log((i-1)!) - (i-1). \tag{8}$$


Our main results of this chapter are contained in the following
theorem:

Theorem 2 For any n and i such that $n > i \ge 3$, we have

$$V_i(n) \ge n - i + (i-1)(h_i(n) - 1), \tag{9}$$
$$W_i(n) \ge n - i + (i-1)h_i(n). \tag{10}$$

Remark 1 If we define $h_1(n) = h_2(n) = \lceil \log n \rceil$, then
the lower bounds as given by Theorem 2 also hold when i = 1, 2. In
fact, they are close to the precise bounds that are known in these
cases:

$$V_1(n) = W_1(n) = n - 1,$$
$$V_2(n) = W_2(n) = n - 2 + \lceil \log n \rceil \quad \text{(Kislitsyn's Theorem)}.$$

Remark 2 For i = 3, Theorem 2 gives the following lower bounds:

$$V_3(n) \ge n - 3 + 2(\log n - 3) = n - 9 + 2\log n,$$
$$W_3(n) \ge n - 3 + 2(\log n - 2) = n - 7 + 2\log n.$$

These are slightly worse than the lower bounds we actually proved for
i = 3 in Theorem 1 of Chapter 2. The reason is of course that here we
would rather use a varied (and slightly weakened) form of Lemma 1, so
that it can serve more conveniently as a basis for inductive
arguments (cf. Lemma 2 and its proof in Sec. 3.4.2).
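To get a feeling for the size of these bounds, the quantities of Definition 4 and Theorem 2 can be evaluated numerically. The Python sketch below assumes the reading $h_3(n) = \log n - 2$ and, for i > 3, $h_i(n) = \log n - (i-3)\log(\log^* n) - 2\log((i-1)!) - (i-1)$, with (9) and (10) both carrying the factor (i-1). The function names are illustrative, and the results are real numbers rather than integers because log n is not rounded.

```python
import math


def log_star(n: int) -> int:
    """Largest k such that a tower of k 2's is <= n, for n >= 2."""
    k, tower = 1, 2
    while tower < n.bit_length():
        k, tower = k + 1, 2 ** tower
    return k


def h(i: int, n: int) -> float:
    """h_i(n) of Definition 4, for n > i >= 3."""
    if i < 3 or n <= i:
        raise ValueError("need n > i >= 3")
    if i == 3:
        return math.log2(n) - 2
    return (math.log2(n)
            - (i - 3) * math.log2(log_star(n))
            - 2 * math.log2(math.factorial(i - 1))
            - (i - 1))


def v_lower(i: int, n: int) -> float:
    """Right-hand side of (9)."""
    return n - i + (i - 1) * (h(i, n) - 1)


def w_lower(i: int, n: int) -> float:
    """Right-hand side of (10)."""
    return n - i + (i - 1) * h(i, n)


if __name__ == "__main__":
    for i, n in [(3, 1024), (4, 2 ** 16), (5, 2 ** 20)]:
        print(i, n, round(h(i, n), 3), round(v_lower(i, n), 3), round(w_lower(i, n), 3))
```

For i = 3 and n = 1024 the sketch reproduces n - 9 + 2 log n = 1035 and n - 7 + 2 log n = 1037. For small n relative to i the value of $h_i(n)$ becomes small or negative, in which case the bounds carry little information.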


Remark 3 Compare the lower bounds of Theorem 2 with the upper
bounds of Hadian-Sobel and Kislitsyn respectively. From (8), we have

$$h_i(n) > \lceil \log n \rceil - (i-3)\log(\log^* n) - 2\log((i-1)!) - i$$

since $\log n + 1 > \lceil \log n \rceil$. Hence by comparing Equs.
(9), (10) with (1) and (2) of Sec. 1.1, we see that the gap between
upper and lower bounds on $V_i(n)$ and $W_i(n)$ is now less than

$$(i-1)(i-3)\log(\log^* n) + (i-1)[\, 2\log((i-1)!) + i + 1 \,].$$

As a consequence, for any fixed i > 3, the asymptotic behaviors
of $V_i(n)$ and $W_i(n)$ are determined to within

$$(i-1)(i-3)\log(\log^* n) + O(1).$$
Remark 4 For large n and i, since

$$\log((i-1)!) \approx \log(i!) \approx i \log i,$$

we see from the definition of $h_i(n)$ in (8) that

$$h_i(n) \approx \log n - i(2\log i + \log(\log^* n)).$$

Hence $h_i(n)$ approaches zero at $i \approx (\log n)/(2\log(\log n))$.


Remark 5 When $h_i(n)$ gradually approaches zero or even becomes
negative as i grows larger and larger, formula (9) is superseded by
Pratt's lower bound as given by Equ. (3) of Sec. 1.1:

$$V_i(n) \ge n + 2i - \log n \quad \text{for } i \le n/3,$$
$$V_i(n) \ge (3n+i)/2 - \log n \quad \text{for } n/3 \le i \le n/2.$$

Also, (10) may be replaced by the following lower bound on $W_i(n)$:

Theorem 3 For any n and i such that $n \ge i$,

$$W_i(n) \ge n - i + \lceil \log(i!) \rceil.$$


Proof This is purely by information-theoretical arguments. Suppose
in a tournament $A_1, \ldots, A_i$ (unordered) are known to be the best i
players, and the order of the remaining n-i players is fixed.
Then, just in order to completely determine the ranking of these i
top players, $\lceil \log(i!) \rceil$ matches between them could be necessary in the
worst case. However, for each of the remaining n-i players, there still
must be a match in the tournament in which he lost to someone whose
ranking is no higher than the i-th. Hence these n-i "crucial"
matches are completely disjoint from those $\lceil \log(i!) \rceil$ mentioned
before, and this tournament contains a total of $n - i + \lceil \log(i!) \rceil$
matches at least.
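For concreteness, the bound of Theorem 3 is easy to tabulate. The following is a minimal sketch assuming logarithms to base 2 (natural for binary comparisons); the function name `theorem3_lower_bound` is hypothetical.

```python
import math


def theorem3_lower_bound(n: int, i: int) -> int:
    """Evaluate n - i + ceil(log2(i!)), the lower bound on W_i(n) in Theorem 3."""
    if not 1 <= i <= n:
        raise ValueError("requires 1 <= i <= n")
    factorial = math.factorial(i)
    # For an integer x >= 1, (x - 1).bit_length() equals ceil(log2(x)).
    ceil_log = (factorial - 1).bit_length()
    return n - i + ceil_log
```

With n = i = 5 the expression reduces to the information-theoretic bound for sorting five elements, namely 7 comparisons.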
3.3 Proof of Main Theorem


The proof of Theorem 2 depends on the following lemma in the same way
that Theorem 1 of Sec. 2.3 depends on Lemma 1. Indeed, Lemma 2
generalizes the analysis of the first-defeat-tree carried out in
Lemma 1. Here a "championship" tournament is a tournament which
determines, among other things, the best player of all.


Lemma 2 Let T be a BS-ruled championship tournament of n players.
For any i such that $i \ge 3$, we assert that there must exist a time
t in T when the following two statements are true:

$(1)_i$ There are $\le i-1$ undefeated players who have made a total of
$(i-1)h_i(n)$ first defeats at least.

$(2)_i$ The champion himself has made at least $h_i(n)$ first defeats.

(The second statement serves mainly as an induction hypothesis that is
needed for the proof of Lemma 2.)


As in Sec. 2.3, let us first assume that the lemma holds and 


prove Theorem 2. The proof of Lemma 2 will be given in Sec. 3.4. 


Proof of Theorem 2


Given an algorithm for selecting the i-th best player, we can
find the champion by using at most $V_1(i-1) = i-2$ additional matches
to determine the best among the top i-1 players. This means that if
$V_i'(n)$ is the minimum number of comparisons for selecting the largest
and the i-th largest element, then

$$V_i(n) \ge V_i'(n) - (i-2)$$

and

$$W_i(n) \ge V_i'(n).$$

Thus, to prove Theorem 2, it suffices to show that

$$V_i'(n) \ge n - i + (i-1)h_i(n).$$


Given an algorithm for selecting the champion and the i-th best
player, we first apply Basic Strategy to it. By Lemma 2 there must be
a crucial moment t when there are j ($1 \le j \le i-1$) undefeated players
$A_1, \ldots, A_j$ who have made a total of $(i-1)h_i(n)$ first defeats at
least. As soon as this occurs, we shall switch to a Clear Strategy.
Under the latter strategy, $A_1, \ldots, A_j$ shall always win when they
play any of the remaining n-j guys; in other circumstances the
outcome may be arbitrary. Now, $A_1, \ldots, A_j$ necessarily turn out to
be the j highest ranking players of the tournament. However, as far
as determining the i-th best player is concerned, $V_1(n-i+1) = n-i$
matches have to be played between the n-i+1 low ranking players in
order to determine the best among them. Since these players are
completely disjoint from the j highest ranking players, we see that
the tournament must contain no fewer than $n - i + (i-1)h_i(n)$ matches
all together.
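Chaining the two reductions used in this proof makes the shape of the resulting bound on $V_i(n)$ explicit. The display below is only a restatement under an assumed notation: $V_i'(n)$ stands for the cost of selecting both the largest and the i-th largest element, and the two inequalities are read as in the proof above. It is not a quotation of Equ. (9).

```latex
\begin{align*}
V_i(n) &\ge V_i'(n) - (i-2) \\
       &\ge \bigl(n - i + (i-1)\,h_i(n)\bigr) - (i-2) \\
       &= n - 2i + 2 + (i-1)\,h_i(n).
\end{align*}
```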
3.4 Proof of Lemma 2


3.4.1 Auxiliary Propositions


Lemma 2 will be proved by induction on i. In trying to proceed
from say i=q to i=q+1, the problem of delicate "timing" in the
tournament turns out to be of vital importance. Propositions 1 and 2
below are two instances of how one deals with this problem in the
proof of Lemma 2.


As in the proof of Lemma 1, we consider the first-defeat-tree of T,
the given BS-ruled championship tournament. Let $x_0, \ldots, x_s$ be those
players whose first defeats are made by $x_c$, the champion, at time
$t_0, \ldots, t_s$ respectively with $t_0 < t_1 < \cdots < t_s$. Also let $d_j$ be
the number of first defeats made by $x_j$, $0 \le j \le s$. (cf. Figure 4 of
Sec. 2.3.2) Furthermore, for any $x_j$, $0 \le j \le s$, let $T_j$ be the
"subtournament" of T consisting of those matches (in the order they occur in T)
represented by the edges of the subtree, rooted at $x_j$, of the FDT.
Let $T'_j$ denote the subtournament consisting of those matches of
$T_0, \ldots, T_{j-1}$ together with the first defeats of $x_0, \ldots, x_{j-1}$ (also in the
order they occur in T). (cf. Figure 5) One can easily see that $T_j$
and $T'_j$, when considered as tournaments themselves, are both BS-ruled
championship tournaments. Within T, both subtournaments $T_j$ and $T'_j$
are completed before time $t_j$.

Figure 5 Subtournaments $T_j$ and $T'_j$


In terms of the notations defined in the last paragraph, we have

Proposition 1 Suppose Lemma 2 is true for $i = q \ge 3$. If for a
given BS-ruled championship tournament T (of n players), there exist
j and m satisfying the following conditions, then Lemma 2 is also true
in T when i = q+1:

(i) $0 \le j \le s$ and $m \le n$,

(ii) $Q(x_j) \ge m$ and $h_q(m) \ge h_{q+1}(n)$.
Proof Since $x_c$ defeated $x_j$ at time $t_j$, by Basic Strategy we
must have

$$(\text{weight of } x_c \text{ at time } t_j) \ge Q(x_j).$$

Since

$$Q(x_j) \ge m,$$

this means that both tournaments $T_j$ and $T'_j$ involve at least m
players.

By our induction hypothesis, there exists a time t in $T_j$ when

$(1)_q$ some $\le q-1$ undefeated players in $T_j$ have made a total of
$(q-1)h_q(m)$ first defeats at least;

$(2)_q$ $x_j$ (the champion of $T_j$) by this time has made at least $h_q(m)$
first defeats.

Similarly, there exists a time t' in $T'_j$ when

$(1)'_q$: same as statement $(1)_q$, only $T_j$ is replaced by $T'_j$;

$(2)'_q$: same as statement $(2)_q$, only $x_j$ is replaced by $x_c$.

Now, if $t' < t$, then $(2)'_q$ is also true at time t. Hence at
time t, by $(1)_q$ and $(2)'_q$, there are $\le q$ undefeated players who
have made a total of at least $q\,h_q(m)$ first defeats. Since by
assumption $h_q(m) \ge h_{q+1}(n)$, this together with statement $(2)'_q$ above
show that both assertions of Lemma 2 are true in T for i=q+1.

On the other hand, if $t < t'$, we will choose the time to be t'.
Then statements $(1)'_q$ and $(2)_q$ above show that at time t' some $\le q$
undefeated players have made a total of at least $q\,h_q(m) \ge q\,h_{q+1}(n)$
first defeats. Again, this together with $(2)'_q$ show that Lemma 2
is satisfied for i=q+1.


Proposition 2 Suppose Lemma 2 is true for $i = q \ge 3$. If for a given
BS-ruled championship tournament T of n players, there exists an m and
q-1 indices named $j_1, j_2, \ldots, j_{q-1}$ that satisfy the following
conditions, then Lemma 2 is also true in T when i=q+1:

(i) $m \le n$, $0 \le j_1 < j_2 < \cdots < j_{q-1} \le s$;

(ii) $Q(x_{j_\ell}) \ge m$ for $1 \le \ell \le q-1$, and $a + (q-1)h_q(m) \ge q\cdot h_{q+1}(n)$;

(iii) $j_1 = a$ and $a \ge \log n$.

(cf. Figure 6)

Figure 6 A tournament of Proposition 2
Proof Since $Q(x_{j_\ell}) \ge m$ for $1 \le \ell \le q-1$, this means that each of the
q-1 subtournaments $T_{j_1}, \ldots, T_{j_{q-1}}$ involves at least m players.
Applying the induction hypotheses, we see that for $1 \le \ell \le q-1$,
there exists time $t'_\ell < t_{j_\ell}$, such that at time $t'_\ell$ the following
two statements are true:

$(1)_q^{\ell}$ Some $\le q-1$ undefeated players in $T_{j_\ell}$ have made a total of
$(q-1)h_q(m)$ first defeats at least.

$(2)_q^{\ell}$ $x_{j_\ell}$ himself has made at least $h_q(m)$ first defeats.

There are two cases:

(I) $t'_\ell < t_{j_1}$ for all $1 \le \ell \le q-1$.

If this is the case, then at time $t_{j_1}$, $x_c$ has made a first
defeats (namely on $x_0, \ldots, x_{a-1}$). Also, since $t_{j_\ell} \ge t_{j_1} > t'_\ell$, the
$x_{j_\ell}$ are all undefeated at time $t_{j_1}$. Hence, by $(2)_q^{\ell}$, we know
that at time $t_{j_1}$, $x_c$ and $x_{j_\ell}$ for $1 \le \ell \le q-1$ are q undefeated
players who have made a total of $a + (q-1)h_q(m)$ first defeats.
Since by assumption

$$a \ge \log n, \text{ hence } a \ge h_{q+1}(n),$$

and

$$a + (q-1)h_q(m) \ge q\cdot h_{q+1}(n),$$

we see that Lemma 2 is true for this tournament when i=q+1.

(II) $t'_\ell \ge t_{j_1}$ for some $\ell$ such that $1 \le \ell \le q-1$.

For this particular $\ell$, $x_c$ at time $t'_\ell$ has again made at least
a first defeats. And by statement $(1)_q^{\ell}$, at time $t'_\ell$ there are $\le q-1$
undefeated players who together with $x_c$ have made at least
$a + (q-1)h_q(m) \ge q\cdot h_{q+1}(n)$ first defeats. Hence again Lemma 2 is true
when i=q+1 for this tournament. Thus the proof of Proposition 2
has been completed.
3.4.2 The Inductive Proof 


We are now ready to prove Lemma 2.

A) Basis of induction, i=3.

Let us look at Figure 4 of Sec. 2.3.2. Suppose Lemma 2 were
false; we would then have $j + d_j < 2\lceil\log n\rceil - 4$ for all j such that
$\lceil\log n\rceil - 2 \le j \le s$. (If not so, we can choose the time to be $t_j$ where
$j \ge \lceil\log n\rceil - 2$ and $j + d_j \ge 2\lceil\log n\rceil - 4 \ge 2h_3(n)$. At this time $t_j$
both statements $(1)_3$ and $(2)_3$ of Lemma 2 are satisfied.) Thus,

$$d_{\lceil\log n\rceil-2} < 2\lceil\log n\rceil - 4 - (\lceil\log n\rceil - 2) = \lceil\log n\rceil - 2,$$

$$d_{\lceil\log n\rceil-1} < \lceil\log n\rceil - 3,$$

$$\vdots$$

$$d_s < 2\lceil\log n\rceil - 4 - s.$$

From Fact 2 of Sec. 2.2, we have

$$\sum_{j=\lceil\log n\rceil-2}^{s} Q(x_j) \le 2^{\lceil\log n\rceil-3} + 2^{\lceil\log n\rceil-4} + \cdots + 2 + 2^0 = 2^{\lceil\log n\rceil-2} - 1. \quad (11)$$

(Note, if $\lceil\log n\rceil - 2 \le 0$, then $h_i(n) = 0$ for all $i \ge 3$, and Lemma 2 is
trivially true in this case. Hence we may assume n > 4.)

From (11) and Equs. (5), (6) of Sec. 2.3, we get

$$n - 1 = \sum_{j=0}^{\lceil\log n\rceil-3} Q(x_j) + \sum_{j=\lceil\log n\rceil-2}^{s} Q(x_j) < 2^{\lceil\log n\rceil-2} + 2^{\lceil\log n\rceil-2} - 1 \le n - 2,$$

which is a contradiction. This proves Lemma 2 for i=3.


B) Suppose the lemma is true for $i = q \ge 3$; we will proceed to prove
it for i = q+1.

If for the given tournament $s \ge q\cdot\log n$, then at time say $t_{s-q}$,
$x_c$ and $x_{s-q+1}, x_{s-q+2}, \ldots, x_s$ are q undefeated players where
$x_c$ alone has made s-q+1 first defeats (namely on $x_0, \ldots, x_{s-q}$).
Since $s-q+1 > q\cdot\log n - q = q(\log n - 1) \ge q\cdot h_{q+1}(n)$, Lemma 2 is
true for i=q+1. Hence from now on we shall assume $s < q\cdot\log n$.


Notation For $1 \le k \le \log^* n$, let $\log^{(k)} n$ denote $\log\cdots\log n$
(the logarithm iterated k times), and $i_k$ denote $\log n + (q-1)\log^{(k)} n$.
Also, let $n' = n/(2\log^* n)$.

We shall divide the problem up into $\log^* n$ cases. Case k, for
$1 \le k \le \log^* n - 1$, makes the assumption

$$\sum_{i_{k+1} \le j < i_k} Q(x_j) \ge n'.$$

The last case, Case $\log^* n$, makes the assumption

$$\sum_{0 \le j < 2(q-1)+\log n} Q(x_j) > n/2 - 1.$$

It is seen that these $\log^* n$ cases do cover all possibilities
since we always have

$$\sum_{i_2 \le j < i_1} Q(x_j) + \sum_{i_3 \le j < i_2} Q(x_j) + \cdots + \sum_{0 \le j < \log n + 2(q-1)} Q(x_j)$$

$$\ge \sum_{0 \le j \le s} Q(x_j) \quad (\text{because } i_{\log^* n} \le \log n + 2(q-1))$$

$$= n - 1$$

$$\ge \underbrace{n/(2\log^* n) + \cdots + n/(2\log^* n)}_{\log^* n - 1 \text{ terms}} + (n/2 - 1).$$

Now, Cases k, $1 \le k \le \log^* n - 1$, are all proved in the same way.


Proof of Case k, $1 \le k \le \log^* n - 1$


There are two possibilities:

(I) The maximum of $Q(x_j)$, $i_{k+1} \le j < i_k$, is $\ge n'/(q-1)$.

If only we can show $h_q(n'/(q-1)) \ge h_{q+1}(n)$, then the desired result
will follow from Proposition 1 of Sec. 3.4.1. Indeed,

$$h_q(n'/(q-1)) = \log n - \log(q-1) - 1 - \log(\log^* n) - (q-3)\log(\log^* n) - 2\log((q-1)!) - (q-1)$$

$$\ge \log n - (q-2)\log(\log^* n) - 2\log(q!) - q$$

$$= h_{q+1}(n).$$

Therefore we are done in this case.

(II) The maximum of $Q(x_j)$, $i_{k+1} \le j < i_k$, is $< n'/(q-1)$.

In this case, let $Q(x_{j_1}), \ldots, Q(x_{j_{q-1}})$ be the q-1 largest
weights among $Q(x_j)$, $j \in [i_{k+1}, i_k)$. (Note $i_k - i_{k+1} \ge q-1$.)
Then we must have $Q(x_{j_\ell}) \ge n'/((q-1)(i_k - i_{k+1}))$ for all $1 \le \ell \le q-1$.
Denote this last fraction by m. If we can show that
$a + (q-1)h_q(m) \ge q\cdot h_{q+1}(n)$, then the conclusion will follow from
Proposition 2 of Sec. 3.4.1. Thus, since $a = j_1 \ge i_{k+1}$,
$i_k - i_{k+1} < (q-1)\log^{(k)} n$, and $n' = n/(2\log^* n)$,

$$a + (q-1)h_q(m) \ge \log n + (q-1)\log^{(k+1)} n + (q-1)\bigl(\log n - 2\log(q-1) - \log^{(k+1)} n - 2\log((q-1)!) - q - (q-2)\log(\log^* n)\bigr)$$

$$\ge q\bigl(\log n - (q-2)\log(\log^* n) - 2\log(q!) - q\bigr)$$

$$= q\cdot h_{q+1}(n).$$

This completes the proof of Case k, $1 \le k \le \log^* n - 1$.


Proof of Case $\log^* n$


In this case we have the assumption

$$\sum_{0 \le j < 2(q-1)+\log n} Q(x_j) > n/2 - 1. \quad (12)$$

But by Equ. (5) of Sec. 2.3 we have

$$\sum_{0 \le j \le \lfloor\log n\rfloor-3} Q(x_j) \le 2^0 + 2^1 + \cdots + 2^{\lfloor\log n\rfloor-3} = 2^{\lfloor\log n\rfloor-2} - 1 \le n/4 - 1. \quad (13)$$

From Equs. (12) and (13), we get

$$\sum_{\log n-3 < j < \log n+2(q-1)} Q(x_j) > n/4.$$

Hence the maximum of $Q(x_j)$, $\log n - 3 < j < \log n + 2(q-1)$, must be greater
than $n/(8(q+1))$.

As in (I) of the proof of Case k, $k < \log^* n$, we need only show that
$h_q(n/(8(q+1))) \ge h_{q+1}(n)$; then the result will follow immediately from
Proposition 1 of Sec. 3.4.1. The calculation is straightforward:

$$h_q(n/(8(q+1))) \ge \log n - 2 - \log(2q+2) - (q-3)\log(\log^* n) - 2\log((q-1)!) - (q-1)$$

$$\ge \log n - (q-2)\log(\log^* n) - 2\log(q!) - q$$

$$= h_{q+1}(n).$$

(We assume here n > 4 due to the remark following Equation (11).)
This proves Case $\log^* n$, and the proof of Lemma 2 is now complete.


CHAPTER 4 


FINDING A MEDIOCRE PLAYER


4.1 Introduction 


In the preceding chapters we have successfully extended Kislitsyn's 
lower bound to the case when, instead of finding the second best of n 
players, the i-th best is to be determined. We now turn our attention
to another type of selection problem.


Given i and j with $i + j + 1 \le n$, our objective now is to find a
player who is neither among the i top players nor one of the j worst
players. We shall say such a player is (i,j)-mediocre. Historically
this problem is closely connected with the finding of the median
element, whose starting point is usually the selection of an element that
is not too close to either extreme.


Technically, the selection of the i+1st largest element is a
special case of this "mediocre player" problem in which n = i+j+1.
This connection suggests a way of finding an (i,j)-mediocre player.
We simply pick i+j+1 elements arbitrarily and select the i+1st
largest among them; it is obvious that this element satisfies the
"(i,j)-mediocre" requirement. The question that naturally follows is
"Is this the best algorithm?" In Sec. 4.3 we will prove that this is
the case when i=1. We will also show that if the answer to this


question is "yes" in general, then the median of n elements can be
found with no more than 3n comparisons. These results are developed
in the following sections.
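The procedure described in this section (pick any i+j+1 elements and take the i+1st largest among them) can be sketched as follows. This is only an illustration of the idea, assuming distinct elements; the names `find_mediocre` and `is_mediocre` are hypothetical, and `heapq.nlargest` is used for brevity rather than for comparison-optimality, so the sketch says nothing about the quantity $V_{i+1}(i+j+1)$.

```python
import heapq


def find_mediocre(elements, i, j):
    """Return the (i+1)-st largest of the first i+j+1 elements."""
    if len(elements) < i + j + 1:
        raise ValueError("need at least i + j + 1 elements")
    sample = elements[: i + j + 1]  # an arbitrary subset; here simply a prefix
    # nlargest returns the i+1 largest in descending order; the last is the (i+1)-st.
    return heapq.nlargest(i + 1, sample)[-1]


def is_mediocre(x, elements, i, j):
    """Check that x is smaller than at least i elements and greater than at least j."""
    larger = sum(1 for e in elements if e > x)
    smaller = sum(1 for e in elements if e < x)
    return larger >= i and smaller >= j
```

The selected element already has i larger and j smaller elements inside the sample, so it remains (i,j)-mediocre no matter how the untouched elements compare to it.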
4.2 Properties of S(i,j,n)


For $i+j+1 \le n$, let S(i,j,n) denote the minimum number of
comparisons needed to find an (i,j)-mediocre element as defined in Sec. 4.1.
The following facts are easy to verify:

(i) $S(i,j,n') \le S(i,j,n)$ if $n' \ge n \ge i+j+1$.
This is true since from n' elements we can arbitrarily choose
n elements and in S(i,j,n) comparisons find a desired (i,j)-mediocre
element.

(ii) There is an integer $N_0 \le 12(i+j+1)$ such that
$S(i,j,n) = S(i,j,N_0)$ for all $n \ge N_0$.
Proof By the algorithm of Blum et al [1], it takes less than
6m comparisons to find the t-th largest of m elements for any
$t \le m$. Since $S(i,j,i+j+1) = V_{i+1}(i+j+1)$, we thus have
$S(i,j,n) \le S(i,j,i+j+1) < 6(i+j+1)$ for all $n \ge i+j+1$ by (i).
Since each comparison involves two elements, an optimal algorithm
never consults more than 12(i+j+1) elements, so we must have

$$S(i,j,n) \ge S(i,j,12(i+j+1)) \quad \text{for } n \ge 12(i+j+1).$$

Thus, for fixed i, j, the function S is non-increasing in n and
reaches a constant before n is 12(i+j+1).
As a result, we have

Theorem 4 For $n \ge i+j+1$, S(i,j,n) is independent of n if and only
if selecting (optimally) the i+1st largest element of an arbitrary
subset of i+j+1 elements is an optimal procedure for finding an
(i,j)-mediocre element in n elements.

Proof If S(i,j,n) is independent of n, then

$$S(i,j,n) = S(i,j,i+j+1) = V_{i+1}(i+j+1).$$

This shows that the proposed algorithm is optimal. On the other hand,
if the algorithm described in the theorem is optimal, we will have

$$S(i,j,n) = V_{i+1}(i+j+1) \quad \text{for all } n \ge i+j+1,$$

which shows that S(i,j,n) is independent of n.
4.3 $S(1,j,n) = V_2(j+2)$


Because of Theorem 4 of the last section, it becomes an intriguing
question whether S(i,j,n) is independent of n or not. In this
section we will show that S(1,j,n) is independent of n. This
lends support to the conjecture that S(i,j,n) depends only on i and
j. We will discuss some implications of this conjecture in the next
section.

Theorem 5 $S(1,j,n) = V_2(j+2) = j + \lceil \log(j+2) \rceil$.


Proof It suffices to show that $S(1,j,n) \ge j + \lceil \log(j+2) \rceil$. For any
given algorithm that finds a (1,j)-mediocre player, we shall apply
Basic Strategy (cf. Sec. 2.2) to it and call the resulting tournament
T. Suppose $x_m$ is the (1,j)-mediocre player found in T, define

A = {x | either $x = x_m$ or x is determined to be inferior to $x_m$ by T}.

By the definition of "(1,j)-mediocre", $x_m$ must be defeated at least once
in T; hence none of the guys in A is an undefeated player. Also, let

B = {y | y is undefeated at the end of T and has weight > 1}.

Thus, if $B = \{y_1, \ldots, y_r\}$, then $y_1, \ldots, y_r$ are those players
corresponding to the roots that have at least one descendant (besides
itself) in the first-defeat-tree of T. Certainly $A \cap B = \emptyset$.


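Claim If for $1 \le \ell \le r$, $y_\ell$ has weight $m_\ell$, then we must have

$$\sum_{\ell=1}^{r} \lceil \log(m_\ell) \rceil \ge \lceil \log(j+2) \rceil.$$

Proof of Claim Suppose for $1 \le \ell \le r$, we have $2^{k_\ell} < m_\ell \le 2^{k_\ell+1}$.
Then,

$$\sum_{\ell=1}^{r} \lceil \log(m_\ell) \rceil = (k_1+1) + \cdots + (k_r+1) = r + (k_1 + \cdots + k_r). \quad (14)$$

On the other hand, since each term $2^{k_\ell+1}$ is at least 2,

$$m_1 + \cdots + m_r \le 2^{k_1+1} + \cdots + 2^{k_r+1} \le 2^{k_1+1} \cdots 2^{k_r+1} = 2^r \cdot 2^{(k_1+\cdots+k_r)}.$$

Hence

$$\lceil \log(m_1 + \cdots + m_r) \rceil \le r + (k_1 + \cdots + k_r). \quad (15)$$

However, since every player in A must be the descendant of some $y_\ell$,
$1 \le \ell \le r$, we have

$$m_1 + \cdots + m_r \ge r + |A| \ge 1 + |A|. \quad (16)$$

Since by the definition of (i,j)-mediocre we must have $|A| \ge j+1$,
Equ. (16) becomes

$$m_1 + \cdots + m_r \ge j+2. \quad (17)$$

The Claim then follows from Equs. (14), (15) and (17).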
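The inequality of the Claim can be spot-checked by brute force over small weight vectors. The sketch below is not part of the proof; it assumes base-2 logarithms, takes each weight to be at least 2 (as for the players in B), and uses the hypothetical names `ceil_log2` and `claim_holds` with explicit search limits.

```python
from itertools import combinations_with_replacement


def ceil_log2(m: int) -> int:
    """ceil(log2(m)) for an integer m >= 1, computed without floating point."""
    return (m - 1).bit_length()


def claim_holds(j: int, max_weight: int, max_roots: int) -> bool:
    """Check the Claim for all weight multisets within the given limits.

    Every multiset of r <= max_roots weights in 2..max_weight whose sum is at
    least j + 2 must satisfy sum(ceil(log m)) >= ceil(log(j + 2)).
    """
    target = ceil_log2(j + 2)
    for r in range(1, max_roots + 1):
        for weights in combinations_with_replacement(range(2, max_weight + 1), r):
            if sum(weights) >= j + 2 and sum(ceil_log2(m) for m in weights) < target:
                return False
    return True
```

A call such as `claim_holds(10, max_weight=12, max_roots=4)` examines about a thousand multisets and should find no counterexample, in line with (14), (15) and (17).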
Now that Claim has been proved, Theorem 5 follows quite easily.
Indeed, from Claim and Fact 2 of Sec. 2.2, we know that $y_1, \ldots, y_r$
have played at least $\lceil \log(j+2) \rceil$ matches all together. But from the
definition of A, it follows that at least |A| - 1 matches have been
played between the players of A (one "crucial" match for each player
in A other than $x_m$). Since A and B are disjoint, we see that this
tournament must contain at least

$$\lceil \log(j+2) \rceil + |A| - 1 \ge \lceil \log(j+2) \rceil + j$$

matches. Thus the proof of Theorem 5 is complete.
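As a small worked illustration of Theorem 5, the value $j + \lceil \log(j+2) \rceil$ can be tabulated and compared with the classical expression $n - 2 + \lceil \log n \rceil$ for selecting the second largest of n elements, evaluated at n = j+2. The function names below are hypothetical and base-2 logarithms are assumed.

```python
def ceil_log2(m: int) -> int:
    """ceil(log2(m)) for an integer m >= 1."""
    return (m - 1).bit_length()


def s1(j: int) -> int:
    """Theorem 5: comparisons needed to find a (1, j)-mediocre element."""
    return j + ceil_log2(j + 2)


def v2(n: int) -> int:
    """Classical cost of selecting the second largest of n elements (n >= 2)."""
    return n - 2 + ceil_log2(n)
```

For example, `s1(1)` is 3, matching the three comparisons needed to find the middle one of three elements.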
4.4 Connections with Median Computation


In the last section we proved that S(1,j,n) is independent 
of n. This by no means implies S(i,j,n) is independent of n 
for other i as well. However, if this should indeed be true, it 


would have the following interesting consequence. 


Proposition If S(i,j,n) is independent of n for all i,j, then one
can find the median of n elements by using no more than 3n comparisons.

Proof Let M(n) be the minimum number of comparisons for finding the
median of n elements. Then, for a given n, to find an (n/2, n/2)-mediocre
element among 3n/2 elements, one can proceed in two ways:

1) Pick any n+1 elements and find their median. This requires
M(n+1) comparisons in the worst case.

2) Divide the 3n/2 elements into n/2 triplets and sort each
triplet. Then take the central elements of all the triplets and
find their median. This element is easily seen to be a desired
(n/2, n/2)-mediocre element, and this method requires
3*(n/2) + M(n/2) comparisons in the worst case.
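The second method can be sketched in a few lines. This is an illustration only, assuming distinct elements and, as an extra assumption made here to keep the counting exact, an odd number of triplets; the names `mediocre_by_triplets` and `count_larger_smaller` are hypothetical, and the built-in sort stands in for an optimal median procedure.

```python
def mediocre_by_triplets(elements):
    """Median of the central elements of consecutive triplets (method 2)."""
    if len(elements) % 3 != 0:
        raise ValueError("expected a multiple of three elements")
    # Sorting one triplet costs at most 3 comparisons.
    triplets = [sorted(elements[k:k + 3]) for k in range(0, len(elements), 3)]
    if len(triplets) % 2 == 0:
        raise ValueError("this sketch assumes an odd number of triplets")
    centers = sorted(t[1] for t in triplets)
    return centers[len(centers) // 2]


def count_larger_smaller(x, elements):
    """Return (number of elements larger than x, number smaller than x)."""
    larger = sum(1 for e in elements if e > x)
    smaller = sum(1 for e in elements if e < x)
    return larger, smaller
```

With h triplets (h odd), the returned element lies below (h-1)/2 other centers, each of which has a still larger companion, and below the largest element of its own triplet; that gives h larger elements, and by symmetry h smaller ones.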


Now, if S(i,j,n) is independent of n then the first method is
optimal by Theorem 4. Therefore we have

$$M(n+1) \le 3n/2 + M(n/2),$$

from which it can be proved by induction that $M(n) \le 3n$. Of course we have